Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation

Authors

Jaime Lorenzo-Trueba

Julián D. Echeverry-Correa

Roberto Barra-Chicote

Rubén San-Segundo-Hernández

Javier Ferreiros

Ascensión Gallardo-Antolín

Junichi Yamagishi

Simon King

Juan Manuel Montero-Martínez

Abstract

One of the biggest challenges in speech synthesis is the production of contextually appropriate, natural-sounding synthetic voices. This means that a Text-To-Speech (TTS) system must be able to analyze a text beyond sentence boundaries in order to select, or even modulate, the speaking style according to a broader context. Our current architecture is based on a two-step approach: text genre identification, followed by speaking-style synthesis according to the detected discourse genre. For the final implementation, a set of four genres and their corresponding speaking styles was considered: broadcast news, live sport commentaries, interviews and political speeches. In the final TTS evaluation, the four speaking styles were transplanted to the neutral voices of other speakers not included in the training database. When the transplanted styles were compared to the neutral voices, transplantation was significantly preferred and the similarity to the target speaker was as high as 78%.
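The two-step flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how genres are detected or how styles are synthesized, so the cue-word scoring, the `GENRE_CUES` table, and the function names `identify_genre` and `plan_synthesis` are hypothetical stand-ins. The sketch only shows the control flow: the whole text is classified first, and the detected genre then selects the speaking style to apply to a target speaker's voice.

```python
import re
from collections import Counter

# Hypothetical cue words per genre (illustrative only, not from the paper).
GENRE_CUES = {
    "broadcast news": {"reported", "government", "announced", "officials"},
    "live sport commentary": {"goal", "ball", "match", "striker"},
    "interview": {"you", "your", "tell", "feel"},
    "political speech": {"citizens", "nation", "we", "our"},
}


def identify_genre(text: str) -> str:
    """Step 1: score the whole text (not single sentences) against each genre."""
    tokens = Counter(re.findall(r"[a-z']+", text.lower()))
    scores = {
        genre: sum(tokens[word] for word in cues)
        for genre, cues in GENRE_CUES.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a neutral style when no cue word matches at all.
    return best if scores[best] > 0 else "neutral"


def plan_synthesis(text: str, target_speaker: str) -> dict:
    """Step 2: pick the speaking style for the detected genre.

    A real system would apply a style model to the speaker's voice here;
    this sketch only returns the decision as a plain dictionary.
    """
    return {
        "speaker": target_speaker,
        "style": identify_genre(text),
        "text": text,
    }


if __name__ == "__main__":
    request = plan_synthesis(
        "The striker takes the ball and scores a goal!", "speaker_A"
    )
    print(request["style"])
```

A keyword count is used here only because it is short and easy to follow; any text classifier could take its place without changing the second step.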

Related resources

Automatic prosodic modeling for speaker and task adaptation in text-to-speech

One of the most important demands for future TTS systems is their ability to improve naturalness when embedded in a particular task or application that requires a particular speaking style for a particular speaker. In this paper, we present a new prosodic modeling procedure for improving naturalness by adapting a TTS system to a new speaker and a new speaking style. The proposed procedure is an...

SDBM-Based Speaker Recognition for Speaking Style Variations

There are many factors corresponding to performance degradation of an actual speaker recognition system. Mismatch in speaking style of a target speaker during training and testing is an important one. When a client enrolls in a system, it is natural for him/her to speak in a spontaneous way. However, it is difficult to maintain the same speaking style throughout test phases. In view of this sit...

A Model for Varying Speaking Style in TTS systems

This paper aims to enhance the performance of a TTS system by generating various speaking styles. First we describe three speaking styles (Radio News, Political Address and Conversation) and compare the prosodic features found in these authentic styles with the prosody in “neutral” speech uttered by the eLite TTS system ([1]). Differences concern about 20 prosodic characteristics (F0 span, spee...

Adding speaking style to a TTS system

Design and evaluation of validity of an electronic alternative and augmentative communication system for Persian-speaking children

Introduction: Due to the high prevalence of communication disorders, augmentative and alternative communication methods are one of the options available to solve the problems of these people. Since there are no complex tools for Persian-speaking children with communication disorders, we decided to design communication assistant software for these children that produces sound output. Materials and Meth...

Publication date: 2014
